Coffee: Boost Your Code LLMs by Fixing Bugs with Feedback
Code editing is an essential step towards reliable program synthesis,
automatically correcting critical errors generated by code LLMs. Recent studies
have demonstrated that closed-source LLMs (i.e., ChatGPT and GPT-4) are capable
of generating corrective feedback to edit erroneous inputs. However, it remains
challenging for open-source code LLMs to generate feedback for code editing,
since these models tend to imitate the superficial format of feedback while
providing misleading information. Hence, the focus of our work is
to leverage open-source code LLMs to generate helpful feedback with correct
guidance for code editing. To this end, we present Coffee, a dataset curated
specifically for code fixing with feedback. Using this dataset, we
construct CoffeePots, a framework for COde Fixing with FEEdback via
Preference-Optimized Tuning and Selection. The proposed framework aims to
automatically generate helpful feedback for code editing while minimizing the
potential risk of superficial feedback. The combination of Coffee and
CoffeePots marks a significant advancement, achieving state-of-the-art
performance on the HumanEvalFix benchmark. Code and model checkpoints are publicly
available at https://github.com/Lune-Blue/COFFEE.
Comment: Work in progress
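The abstract describes the framework only at a high level. As a rough, hedged sketch of the feedback-then-edit loop it outlines (sample candidate feedback, select the most helpful candidate with a preference-optimized selector, then edit the code conditioned on that feedback), one might structure the pipeline as below. All function names and interfaces here are hypothetical placeholders, not the released COFFEE/CoffeePots API.

```python
# Minimal sketch of a feedback-then-edit pipeline in the spirit described above:
# sample several candidate feedback messages for a buggy program, keep the one a
# preference-trained selector scores highest (to filter out superficial or
# misleading feedback), then edit conditioned on it.
# All three callables are hypothetical stand-ins, not the actual COFFEE code.
from typing import Callable, List

def fix_with_feedback(
    buggy_code: str,
    generate_feedback: Callable[[str, int], List[str]],  # critic model (assumed interface)
    score_feedback: Callable[[str, str], float],         # preference-optimized selector (assumed)
    apply_edit: Callable[[str, str], str],                # editor model (assumed interface)
    num_candidates: int = 5,
) -> str:
    # 1) Sample multiple candidate feedback messages for the buggy code.
    candidates = generate_feedback(buggy_code, num_candidates)
    # 2) Keep the candidate the selector judges most helpful.
    best = max(candidates, key=lambda fb: score_feedback(buggy_code, fb))
    # 3) Produce the edited program conditioned on the selected feedback.
    return apply_edit(buggy_code, best)
```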
Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents
Human-like chatbots require commonsense reasoning to effectively comprehend
and respond to implicit information present within
conversations. Achieving such coherence and informativeness in responses,
however, is a non-trivial task. Even for large language models (LLMs),
identifying and aggregating key evidence in a single hop is a substantial
challenge, because such evidence is scattered across multiple turns of a
conversation and must be integrated over multiple hops. Hence, our focus is to facilitate such
multi-hop reasoning over a dialogue context, namely dialogue chain-of-thought
(CoT) reasoning. To this end, we propose a knowledge distillation framework
that leverages LLMs as unreliable teachers and selectively distills consistent
and helpful rationales via alignment filters. We further present DOCTOR, a
DialOgue Chain-of-ThOught Reasoner that provides reliable CoT rationales for
response generation. We conduct extensive experiments to show that enhancing
dialogue agents with high-quality rationales from DOCTOR significantly improves
the quality of their responses.
Comment: 25 pages, 8 figures, Accepted to EMNLP 2023
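Speaking only at the level of the abstract, a hedged sketch of the selective distillation step (an unreliable teacher proposes chain-of-thought rationales, and alignment filters keep only the consistent and helpful ones as training data for the student reasoner) might look as follows. The teacher and filter functions are illustrative stand-ins, not the paper's actual code.

```python
# Minimal sketch of alignment-filtered rationale distillation as outlined above:
# an unreliable teacher LLM proposes CoT rationales for each dialogue, and only
# rationales passing both alignment filters are kept to train the student
# reasoner (DOCTOR). All callables are hypothetical stand-ins.
from typing import Callable, List, Tuple

def build_rationale_training_set(
    dialogues: List[Tuple[str, str]],                # (dialogue context, gold response)
    propose_rationales: Callable[[str], List[str]],  # teacher LLM (assumed interface)
    is_consistent: Callable[[str, str, str], bool],  # consistency filter (assumed)
    is_helpful: Callable[[str, str, str], bool],     # helpfulness filter (assumed)
) -> List[Tuple[str, str]]:
    kept: List[Tuple[str, str]] = []
    for context, gold_response in dialogues:
        for rationale in propose_rationales(context):
            # Keep a rationale only if it is consistent with the dialogue and
            # gold response, and actually helpful for generating that response.
            if (is_consistent(context, rationale, gold_response)
                    and is_helpful(context, rationale, gold_response)):
                kept.append((context, rationale))
    return kept
```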